Systems and methods for robust max consensus for wireless sensor networks

ABSTRACT

Various embodiments of systems and methods for robust max consensus for wireless sensor networks in the presence of additive noise by determining and removing a growth rate estimate from state values of each node in a wireless sensor network are disclosed.

CROSS REFERENCE TO RELATED APPLICATIONS

This is a non-provisional application that claims benefit to U.S. provisional application Ser. No. 62/959,564, filed on Jan. 10, 2020, which is herein incorporated by reference in its entirety.

FIELD

The present disclosure generally relates to wireless networks, and more specifically to systems and methods for robust max consensus for wireless sensor networks.

BACKGROUND

A wireless sensor network (WSN) is a distributed network consisting of multi-functional sensors, which can communicate with neighboring sensors over wireless channels. Estimating the statistics of sensor measurements in WSNs is necessary in detecting anomalous sensors, supporting the nodes with insufficient resources, network area estimation, and spectrum sensing for cognitive radio applications, just to name a few. Knowledge of extremes is often used in algorithms for outlier detection, clustering, classification, and localization. However, several factors such as additive noise in wireless channels, random link failures, packet loss and delay of arrival significantly degrade the performance of distributed algorithms. Hence it is important to design and analyze consensus algorithms robust to such adversities.

Although max consensus has been previously studied, the analysis of max consensus algorithms under additive channel noise and randomly changing network conditions has not received much attention. The present disclosure starts with a review of the literature on max consensus in the absence of noise. A distributed max consensus algorithm for both pairwise and broadcast communications has been introduced that also provides an upper bound on the mean convergence time. Recent works consider pairwise and broadcast communications with asynchronous updates and significantly improve the tightness of the upper bound on the mean convergence time. The convergence properties of max consensus protocols have been studied for the broadcast communications setting in distributed networks. The convergence of average and max consensus algorithms in time-dependent and state-dependent graphs has also been analyzed. Asynchronous updates in the presence of bounded delays have been considered. Max-plus algebra is used to analyze convergence of max-consensus algorithms for time-invariant communication topologies in previous works, and for switching topologies in other works, both in the absence of noise. Distributed algorithms to reach consensus on general functions in the absence of noise are studied in previous works. A one-parameter family of consensus algorithms over a time-varying network has been proposed, where consensus on the minimum of the initial measurements can be reached by tuning a design parameter. A distributed algorithm to reach consensus on general functions in a network is presented in some works, where the weighted power mean algorithm is used to calculate the maximum of the initial measurements by setting the design parameter to infinity.

A system model with imperfect transmissions has also been considered, where a message is received with a probability 1−p. This model is equivalent to time-varying graphs, where each edge is deleted independently with a probability p. However, that system model does not consider errors in transmission, but only considers transmission failures (erasures).

Other works consider the presence of additive noise in the network and propose an iterative soft-max based average consensus algorithm to approximate the maximum, which uses non-linear bounded transmissions in order to achieve consensus. This algorithm depends on a design parameter that controls the trade-off between the max estimation error and convergence speed. However, the convergence speed of this soft-max based method is limited compared to the more natural max-based methods considered herein.

It is with these observations in mind, among others, that various aspects of the present disclosure were conceived and developed.

BRIEF DESCRIPTION OF THE DRAWINGS

The patent or application file contains at least one drawing executed in color. Copies of this patent or patent application publication with color drawing(s) will be provided by the Office upon request and payment of the necessary fee.

FIG. 1 is a simplified illustration showing a system for determining robust max consensus for a wireless sensor network;

FIG. 2 is a flowchart illustrating a methodology for determining robust max consensus for the wireless sensor network of FIG. 1 by the system of FIG. 1;

FIG. 3 is a graphical depiction of a network with N=75 nodes;

FIG. 4 is a graphical comparison of upper bound, lower bound and a max update for all nodes with $\mathcal{N}(0,1)$ additive noise for a fixed graph with N=75;

FIG. 5 is a set of random graphs with N=75 and edge deletion probability of p=0.5;

FIG. 6 is a first graphical comparison of upper bound, empirical upper bound and max consensus growth rate for the network;

FIG. 7 is a second graphical comparison of upper bound, empirical upper bound and max consensus growth rate for the network;

FIG. 8 is a graphical representation of the performance of an algorithm in the presence of additive noise from $\mathcal{N}(0,1)$ for fixed graphs;

FIG. 9 is a graphical representation of the performance of the algorithm in the presence of additive noise from $\mathcal{N}(0,1)$ for random graphs with probability of edge deletion p=0.5;

FIG. 10 is a graphical comparison of the algorithm and a soft-max based average consensus algorithm (SMA); and

FIG. 11 is an exemplary computer system for effectuating the functionalities of analyzing max consensus algorithms in the presence of additive noise.

Corresponding reference characters indicate corresponding elements among the views of the drawings. The headings used in the figures do not limit the scope of the claims.

DETAILED DESCRIPTION

The present disclosure discloses systems and methods for analysis of max consensus algorithms in the presence of additive noise and a design of fast max-based consensus algorithms executed by a processor. Due to additive noise, the estimate of the maximum at each node has a positive drift. This results in nodes diverging from the true max value. Max-plus algebra is used to represent this ergodic process of recursive max and addition operations on the state values. This growth rate has been shown to be a constant for stochastic max-plus systems using the subadditive ergodic theorem, in a mathematics context that does not consider max-consensus. In order to study the growth rate, large deviation theory is used and an upper bound is derived for a general noise distribution in the network. The upper bound is shown to depend linearly on the standard deviation, and is a function of the spectral radius of the network. Since the noise variance and spectral radius are not known locally at each node, a two-run algorithm is proposed to locally estimate and compensate for the growth rate, and its variance is analyzed.

Further, the present disclosure includes the complete proof of the upper bound on the growth rate and also extends the analysis by deriving a lower bound. An empirical upper bound, which includes an additional correction factor that depends on the number of nodes, is shown to be tighter than the analytical upper bound. Additionally, the upper and lower bounds for time-varying random graphs, which model transmission failures and additive noise, are derived. Furthermore, a method to directly calculate the upper bound, without solving for the large deviation rate function of the noise, is presented. Also, using concentration inequalities, it is shown that the variance of the growth rate estimator decreases inversely with the number of iterations, and this is used to bound the variance of the estimator. Through simulations, it is shown that the proposed algorithm converges much faster with lower estimation error, in comparison to existing algorithms.

System Model

A network of N nodes is considered. The communication among nodes is modeled as an undirected graph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$, where $\mathcal{V} = \{1, \ldots, N\}$ is the set of nodes and $\mathcal{E}$ is the set of edges connecting the nodes. The set of neighbors of node i is denoted by $\mathcal{N}_{i} = \{j \mid \{i,j\} \in \mathcal{E}\}$. The degree of the i^(th) node, denoted by $d_{i} = |\mathcal{N}_{i}|$, is the number of neighbors of the i^(th) node. The degree matrix D is a diagonal matrix that contains the degrees of the nodes along its diagonal. The connectivity structure of the graph is characterized by the adjacency matrix A, with entries $[A]_{i,j} = 1$ if $\{i,j\} \in \mathcal{E}$ and $[A]_{i,j} = 0$ otherwise. The spectral radius of the network, ρ, corresponds to the eigenvalue with the largest magnitude of the adjacency matrix A.
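As a concrete illustration of these graph quantities, the following is a minimal Python sketch (the ring topology and all variable names are illustrative, not part of the disclosure):

```python
import numpy as np

# Adjacency matrix A of a small undirected graph: a ring of 4 nodes.
A = np.array([[0., 1., 0., 1.],
              [1., 0., 1., 0.],
              [0., 1., 0., 1.],
              [1., 0., 1., 0.]])

# Degree matrix D: diagonal matrix of the row sums d_i = |N_i|.
D = np.diag(A.sum(axis=1))

# Spectral radius rho: largest-magnitude eigenvalue of A.
rho = np.max(np.abs(np.linalg.eigvals(A)))
print(D.diagonal(), rho)  # degrees [2. 2. 2. 2.]; rho ≈ 2.0
```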

In the present disclosure, the following standard assumptions on the system model are considered:

-   Each node has a real number which is its own initial measurement.
-   At each iteration, nodes broadcast their state values to their neighbors in a synchronized fashion. The analysis and the algorithm can be extended to asynchronous networks, assuming that the communication time is small enough that collisions are absent between communicating nodes.
-   Communication between nodes is analog over the wireless channel and is subject to additive noise.
-   A general model of time-varying graphs is considered, wherein a message corrupted by additive noise is received with a probability 1−p, in order to model imperfect communication links.

A system model with imperfect transmissions is considered where a message is received with a probability 1−p, unaffected by the communication noise. Note that this system model is a more general model that not only considers transmission failures (erasures), but also errors in transmission due to imperfect communication links or fading channels.

FIG. 1 illustrates an overview of the system 100 including an example network 102 in communication with a controller or processor 104. As shown, the network 102 includes a plurality of nodes 110, each node 110 interconnected by a respective wireless connection 120. Nodes 110 are each operable for measuring and/or updating a state value x_(i)(t). Wireless connections 120 each contribute noise to the state values of their corresponding nodes 110, modeled by v_(i,j)(t) where i and j are neighboring nodes.

Problem Statement

The goal is to have each node reach consensus on the maximum of the node initial measurements in a distributed network, in the presence of additive communication noise. In some existing max consensus algorithms, at each iteration a node updates its state value by the maximum of the received values from its neighbors. After a number of iterations which is on the order of the diameter of the network, each node reaches a consensus on the maximum of the initial measurements. However, this approach fails in the presence of additive noise on the communication links, because every time a node updates its state value by taking the maximum over the received noisy measurements, the state value of the node drifts.

To address this problem, max-plus algebra and large deviation theory are used to find the growth rate of the state values. An algorithm is then proposed which locally estimates the growth rate and updates the state values accordingly to reach consensus on the true maximum value.

Mathematical Background

For completeness, the mathematical background, including the max-based consensus algorithm and max-plus algebra, is briefly reviewed.

Review of Max-Based Consensus Algorithm

In this section, the conventional max-based consensus algorithm is described. Consider a distributed network with N nodes with real-valued initial measurements, x(0)=[x₁(0), . . . , x_(N)(0)]^(T), where x_(i)(t) denotes the state value of the i^(th) node at time t. Max consensus in the absence of noise merely involves updating the state value of nodes with the largest received measurement thus far in each iteration, so that the nodes reach consensus on the maximum value of the initial measurements. Let v_(ij)(t) be a zero mean, independent and identically distributed (i.i.d) noise sample from a general noise distribution, which models the additive communication noise between nodes i and j at time t. To reach consensus on the maximum of the initial state values, nodes update their state by taking the maximum over the received measurements from neighbors and their own state, given by,

$x_{i}(t+1) = \max\left(x_{i}(t),\ \max_{j \in \mathcal{N}_{i}}\left(x_{j}(t) + v_{ij}(t)\right)\right). \qquad (1)$
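A minimal sketch of one synchronous round of the update in equation (1), in Python with NumPy (the Gaussian noise model and the function name are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_max_update(x, A, sigma=1.0):
    """One round of equation (1): each node keeps the maximum of its own
    state and the noise-corrupted states received from its neighbors."""
    N = len(x)
    v = sigma * rng.standard_normal((N, N))   # v[i, j]: noise on link j -> i
    x_new = x.copy()
    for i in range(N):
        nbrs = np.flatnonzero(A[i])
        x_new[i] = max(x[i], np.max(x[nbrs] + v[i, nbrs]))
    return x_new
```

Iterating this update exhibits the drift analyzed below: even after every node has seen the true maximum, the states keep growing, because a favorable noise sample can always increase the running maximum.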

Review of Max Plus Algebra

Max plus algebra, which can be used to represent the max consensus algorithm as a discrete linear system, is briefly introduced. A max-plus approach was considered for max consensus in previous works, but in the absence of additive noise. The approach of the present disclosure considers the presence of a general noise distribution and studies its effects on equation (1) using max-plus algebra and subadditive ergodic theory.

Max plus algebra is based on two binary operations, ⊕ and ⊗, on the set $\mathbb{R}_{\max} = \mathbb{R} \cup \{-\infty\}$. The operations are defined on x, y ∈ $\mathbb{R}_{\max}$ as follows,

x⊕y=max(x,y) and x⊗y=x+y.

The neutral element for the ⊕ operator is ε:=−∞ and for the ⊗ operator is e:=0. Similarly, for matrices X, Y ∈ $\mathbb{R}_{\max}^{N \times N}$, the operations are defined, for i=1, . . . , N and j=1, . . . , N, as

$[X \oplus Y]_{i,j} = [X]_{i,j} \oplus [Y]_{i,j}, \qquad [X \otimes Y]_{i,j} = \bigoplus_{k=1}^{N}\left([X]_{i,k} \otimes [Y]_{k,j}\right) = \max_{k}\left([X]_{i,k} + [Y]_{k,j}\right),$

where [X]_(i,j) and [Y]_(i,j) denote the (i,j) element of matrices X and Y, respectively. For integers k>l, the product Y(k)⊗Y(k−1)⊗ . . . ⊗Y(l) is denoted by Y(k,l).

Consider x(t) to be an N×1 vector with the state values of the nodes at time t. Max plus algebra can be used to represent equation (1) as,

$x(t+1) = W(t) \otimes x(t) = \underbrace{W(t) \otimes W(t-1) \otimes \cdots \otimes W(0)}_{\triangleq\, W(t,0)} \otimes\; x(0), \quad t > 0, \qquad (2)$

where W(t) is the N×N noise matrix at time t, with elements

$[W(t)]_{ij} = \begin{cases} e, & \text{if } i = j \\ v_{ij}(t), & \text{if } \{i,j\} \in \mathcal{E} \\ \epsilon, & \text{if } \{i,j\} \notin \mathcal{E} \end{cases} \qquad (3)$
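The max-plus operations and the noise matrix W(t) of equations (2)-(3) can be sketched in Python as follows (a NumPy illustration, assuming i.i.d. Gaussian link noise; names are illustrative):

```python
import numpy as np

EPS = -np.inf  # max-plus neutral element for (+) : epsilon = -inf
E0 = 0.0       # max-plus neutral element for (x) : e = 0

def otimes(X, Y):
    """Max-plus matrix product: [X (x) Y]_{i,j} = max_k (X[i,k] + Y[k,j])."""
    return np.max(X[:, :, None] + Y[None, :, :], axis=1)

def noise_matrix(A, sigma, rng):
    """W(t) from equation (3): e = 0 on the diagonal, a noise sample on
    each edge, and epsilon = -inf where no edge exists."""
    N = A.shape[0]
    W = np.where(A > 0, sigma * rng.standard_normal((N, N)), EPS)
    np.fill_diagonal(W, E0)
    return W
```

With these helpers, the recursion x(t+1)=W(t)⊗x(t) of equation (2) is the max-plus matrix-vector product np.max(W + x[None, :], axis=1).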

Existence of Linear Growth

In a queuing theory and networking context, references [17], [18] show that for a system represented by the recursive relation in equation (2), x_(i)(t) grows linearly, in the sense that there exists a real number λ such that, for all i=1, . . . , N,

$\lambda = \lim_{t \to \infty} \frac{1}{t} x_{i}(t) = \lim_{t \to \infty} \frac{1}{t} \mathbb{E}[x_{i}(t)], \qquad (4)$

where the first limit converges almost surely. Note that the constant λ does not depend on the initial measurement x(0), or the node index i. It is also sometimes referred to as the max-plus Lyapunov exponent of the recursion in equation (2).

In the current WSN context, the growth of x_(i)(t) is clearly dependent on the distribution of noise and the graph topology. However, there exist no analytical expressions for the growth rate λ, even for the simplest graphs and noise distributions. Indeed, this is related to a long-standing open problem in first and last passage percolation [26] to obtain analytical expressions for λ. One of the main contributions herein is analytical bounds on λ for arbitrary graphs and general noise distributions. Theorems are introduced to upper and lower bound the growth rate for arbitrarily connected fixed and random graphs.
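Although no closed form for λ is known, it is straightforward to estimate it by simulation. The sketch below iterates the noisy max-plus recursion from zero initial states and reports the per-node slope (a Monte-Carlo illustration; the noise level and iteration count are arbitrary choices):

```python
import numpy as np

def empirical_growth_rate(A, t_max=500, sigma=1.0, seed=1):
    """Monte-Carlo estimate of lambda in equation (4): iterate
    x(t+1) = W(t) (x) x(t) from x(0) = 0 and return the slope x/t."""
    rng = np.random.default_rng(seed)
    N = A.shape[0]
    x = np.zeros(N)
    for _ in range(t_max):
        W = np.where(A > 0, sigma * rng.standard_normal((N, N)), -np.inf)
        np.fill_diagonal(W, 0.0)              # e = 0: self-loops are noiseless
        x = np.max(W + x[None, :], axis=1)    # max-plus matrix-vector product
    return x / t_max                          # each entry approaches lambda
```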

Bounds on Growth Rate for Fixed Graphs

Upper Bound

To derive the upper bound on the growth rate, the following theorem is provided for fixed graphs and general noise distributions. Before stating the theorem, the following Lemma is introduced, which will be later invoked in the theorem.

Lemma 1. Let A be the adjacency matrix and ρ be the spectral radius; then [A^(t)]_(i,j)≤ρ^(t).

Proof: Consider a singular value decomposition (SVD) of A=UΣV^(T), so that A^(t)=(UΣV^(T))(UΣV^(T)) . . . (UΣV^(T)), t times. Let e_(i) be a unit vector of zeros, except a 1 at the i^(th) position. Hence, [A^(t)]_(i,j)=e_(i)^(T)(UΣV^(T))^(t)e_(j) can be written, and [A^(t)]_(i,j)≤ρ^(t) can be shown by showing |e_(i)^(T)ρ^(−t)A^(t)e_(j)|≤1. To this end, it may be written that,

$\rho^{-t} A^{t} = (U\,\bar{\Sigma}\, V^{T})^{t},$

where $\bar{\Sigma} = \rho^{-1}\Sigma$ is a diagonal matrix with diagonal elements

$\left(1, \frac{p_{2}}{\rho}, \ldots, \frac{p_{N}}{\rho}\right),$

where p_(n) is the n^(th) largest singular value of A. Since U and V^(T) are unitary, it is clear that $\bar{\Sigma}$ is a contraction, so that

$\|Ux\| = \|x\|, \quad \|V^{T}x\| = \|x\|, \quad \|\bar{\Sigma}x\| \leq \|x\|, \qquad (5)$

because $\frac{p_{n}}{\rho} \leq 1$ for n=2, . . . , N. Now, successive application of equation (5) yields,

$\left|e_{i}^{T}\rho^{-t}A^{t}e_{j}\right| = \left|e_{i}^{T}\left(U\,\bar{\Sigma}\,V^{T} \cdots U\,\bar{\Sigma}\,V^{T}\right)e_{j}\right| \leq 1,$

where the first equality is because A has non-negative entries, and the inequality uses equation (5) and the Cauchy-Schwarz inequality. Hence, [A^(t)]_(i,j)≤ρ^(t), which concludes the proof of Lemma 1.

Theorem 1. (Upper Bound) Suppose the moment generating function of the noise, $M(\gamma) := \mathbb{E}[e^{\gamma v_{ij}(t)}]$, exists for γ in a neighborhood of the origin. Then, an upper bound on the growth rate λ is given by,

$\lambda \leq \inf\left\{x : \sup_{0 \leq \beta \leq 1}\left[H(\beta) + \beta\log(\rho) - \beta I\!\left(\frac{x}{\beta}\right)\right] < 0\right\}, \qquad (6)$

where ρ is the spectral radius of the graph, H(β) is the binary entropy function given by H(β)=−β log(β)−(1−β)log(1−β), and I(x) is the large deviation rate function of the noise, given by,

$I(x) := \sup_{\gamma > 0}\left(x\gamma - \log(M(\gamma))\right).$

Proof: The proof begins by describing the approach taken to prove the theorem. First, the growth rate λ is formulated as a function of the maximum path sum of random variables. Next, to find the maximal path sum, the number of paths in t hops that involve l self-loops are counted. Finally, the upper bound is put in the desired form using large deviation theory. The different parts of the proof are labeled accordingly, for readability.

Relate λ and maximal path sum: To prove Theorem 1, λ is upper bounded using the elements of W(t,0) defined in equation (2). The (i,j) entry [W(t,0)]_(i,j) can be written as the maximum of the sum of noise samples over certain paths. To be precise, let P_(t)(i,j) be the set of all path sequences $\{p(k)\}_{k=0}^{t}$ that start at p(0)=j and end at p(t)=i, and also satisfy (p(k),p(k+1)) ∈ $\mathcal{E}$ or p(k)=p(k+1) for k∈{0,1, . . . , t−1}, which allows self-loops. For simplicity, $M_{t}^{(i,j)} \triangleq [W(t,0)]_{i,j}$ is defined. The path sum $M_{t}^{(i,j)}$ corresponds to the path whose sum of i.i.d noise samples along the edges in t hops between nodes i and j is maximum among all possible paths, and is given by,

$M_{t}^{(i,j)} \triangleq [W(t,0)]_{i,j} = \max_{\{p(k)\} \in P_{t}(i,j)} \sum_{k=0}^{t} [W(k)]_{p(k),p(k+1)}. \qquad (7)$

For the system defined by the recursive relation in equation (2), let us define the growth rate of this max-plus process to be λ and derive an upper bound on λ. λ is related to $M_{t}^{(i,j)}$ by first recalling the definition in equation (4),

$\lambda = \lim_{t \to \infty} \frac{1}{t} x_{i}(t) = \lim_{t \to \infty} \frac{1}{t} \max_{j}\left(M_{t}^{(i,j)} + x_{j}(0)\right) \leq \max_{j}\left(\limsup_{t \to \infty} \frac{1}{t} M_{t}^{(i,j)} + \limsup_{t \to \infty} \frac{x_{j}(0)}{t}\right) \leq \max_{j}\, \limsup_{t \to \infty} \frac{1}{t} M_{t}^{(i,j)}. \qquad (8)$

In fact, Kingman's subadditive ergodic theorem can be invoked [17] to show that the lim sup in the last inequality can be replaced by a limit. Furthermore, as shown in the same reference, this limit is independent of i and j. Hence, one can work with $M_{t}^{(i,j)}$ instead of x_(i)(t) to upper bound the graph-dependent constant λ. This enables us to drop the maximum over j and study the constant that $M_{t}^{(i,j)}/t$ converges to. Toward this goal, consider the smallest value of x for which

$\lim_{t \to \infty} P\left[\frac{1}{t} M_{t}^{(i,j)} > x\right] = 0. \qquad (9)$

This probability is upper bounded to find bounds on such values of x.

Count the number of paths with l self-loops: Examining equation (7), it is observed that, for a self-loop at time k, p(k)=p(k+1). Since [W(k)]_(i,i)=e=0, there is no contribution to the sum in equation (7), as self-loops are not affected by the noise. So it is useful to express the maximum in equation (7) over the paths that have a fixed number of self-loops l. To study this case, the number of paths that contain l self-loops are counted. Consider the expression [(A+zI)^(t)]_(i,j), where z is an indeterminate variable that will help count the number of paths from node i to node j in t steps that go through a fixed number of l self-loops. Using the binomial expansion, the following can be written:

$[(A + zI)^{t}]_{i,j} = \sum_{l=0}^{t} z^{l} [A^{t-l}]_{i,j} \binom{t}{l},$

where the coefficient of z^(l) is the number of paths from node i to j in t steps that go through l self-loops, denoted as

$n_{l} = \binom{t}{l} [A^{t-l}]_{i,j}.$

Upper bound the growth rate λ: Now the following can be written,

$\frac{1}{t} M_{t}^{(i,j)} = \max_{l \in \{0,1,\ldots,t-1\}} \max\left(\frac{S_{1}^{(l)}}{t}, \ldots, \frac{S_{n_{l}}^{(l)}}{t}\right), \qquad (10)$

where $S_{q}^{(l)}$ is any sum in equation (7) that involves l self-loops, q∈{1, . . . , n_(l)}, and n_(l) is the number of paths in P_(t)(i,j) with l self-loops. Substituting equation (10) into equation (9) and using the union bound, equation (9) can be upper bounded as,

$P\left[\max_{l}\, \max\left(\frac{S_{1}^{(l)}}{t}, \ldots, \frac{S_{n_{l}}^{(l)}}{t}\right) > x\right] \leq \sum_{l=0}^{t} \sum_{q=1}^{n_{l}} P\left[\frac{1}{t} S_{q}^{(l)} > x\right]. \qquad (11)$

Since $S_{q}^{(l)}$ is a sum of (t−l) i.i.d random variables, $S_{q}^{(l)}$ is i.i.d in q for a fixed l, but differently distributed for different l, so the index q can be dropped and the sum over q can be replaced with n_(l) to get,

$P\left[\frac{1}{t} M_{t}^{(i,j)} > x\right] \leq \sum_{l=0}^{t} n_{l} \cdot P\left[\frac{1}{t} S^{(l)} > x\right] = \sum_{l=0}^{t} \binom{t}{l} [A^{t-l}]_{i,j}\, P\left[\frac{1}{t} S^{(l)} > x\right]. \qquad (12)$

From Lemma 1, $[A^{t-l}]_{i,j} \leq \rho^{t-l}$, and letting

$l^{*} = \underset{l}{\operatorname{argmax}}\, \binom{t}{l} \rho^{t-l}\, P\left[\frac{S^{(l)}}{t} > x\right]$

in equation (12),

$P\left[\frac{1}{t} M_{t}^{(i,j)} > x\right] \leq (t+1) \binom{t}{l^{*}} \rho^{t-l^{*}}\, P\left[\frac{S^{(l^{*})}}{t} > x\right]. \qquad (13)$

$P\left[\frac{S^{(l^{*})}}{t} > x\right]$ can be rewritten as $P\left[\frac{S^{(l^{*})}}{t - l^{*}} > \frac{t}{t - l^{*}}x\right]$. In the next step, this second term on the RHS of equation (13) can be bounded by the Chernoff bound as,

$P\left[\frac{S^{(l^{*})}}{t - l^{*}} > \frac{t}{t - l^{*}}x\right] \leq e^{-t(1-\alpha)\, I\left(\frac{x}{1-\alpha}\right)},$

where I(x) is the large deviation rate function and α=l*/t. For large t,

$\binom{t}{\alpha t} = e^{t(H(\alpha) + o(1))},$

where H(α)=−α log(α)−(1−α)log(1−α). For convenience, let β=1−α; then equation (13) reduces to,

$P\left[\frac{1}{t} M_{t}^{(i,j)} > x\right] \leq (t+1)\, e^{t\left(H(\beta) + \beta\log(\rho) - \beta I(x/\beta) + o(1)\right)}. \qquad (14)$

It is well-known that the large-deviation rate function I(·) is monotonically increasing to infinity for arguments restricted above the mean of the random variable (zero-mean noise in this case), so the exponent in equation (14) will be negative when x is large enough. Hence the smallest x for which equation (14) goes to zero exponentially is given by equation (6). This concludes the proof of the Theorem. ▪

Simplified upper bound for Gaussian noise: If the noise is Gaussian, i.e. $v_{ij} \sim \mathcal{N}(0,1)$, then $I(x) = \frac{x^{2}}{2}$ in equation (6). Using algebra, equation (6) simplifies as,

$\lambda \leq \sup_{0 \leq \beta \leq 1} \sqrt{2\beta\left(H(\beta) + \beta\log(\rho)\right)}. \qquad (15)$
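The supremum in equation (15) is one-dimensional and easy to evaluate numerically. A small Python sketch (a grid search; the grid resolution and the example value ρ=30.56 from the simulations below are illustrative):

```python
import numpy as np

def gaussian_upper_bound(rho, grid=100000):
    """Evaluate equation (15): sup over beta in (0,1) of
    sqrt(2*beta*(H(beta) + beta*log(rho))), for unit-variance Gaussian noise."""
    beta = np.linspace(1e-6, 1 - 1e-6, grid)
    H = -beta * np.log(beta) - (1 - beta) * np.log(1 - beta)
    g2 = 2 * beta * (H + beta * np.log(rho))
    return np.sqrt(np.max(np.clip(g2, 0.0, None)))

print(gaussian_upper_bound(30.56))  # roughly 2.6 for the N = 75 example graph
```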

Defining $g(\beta) = \sqrt{2\beta\left(H(\beta) + \beta\log(\rho)\right)}$, the supremum will be achieved for the β that satisfies $\frac{\partial g(\beta)}{\partial \beta} = 0$, which simplifies to

$\rho = \sqrt{\frac{\beta}{1-\beta}}\, e^{-\frac{H(\beta)}{2\beta}}.$

Note that I(·) is a convex function, and as ρ increases, β will approach its upper limit of 1. Therefore, it can be concluded that for graphs with large ρ, the optimal value of β→1, hence it can be written that,

$H(\beta) + \beta\log(\rho) - \beta I(x/\beta) \approx \log(\rho) - I(x), \qquad (16)$

which is negative when I(x)>log(ρ).

This behavior of β was established for the Gaussian case; however, it holds more generally. Since f(x,β)=H(β)+β log(ρ)−βI(x/β) is concave in β for every x, it must only be checked that, for x>0, the β* that solves

$\frac{\partial f(x,\beta^{*})}{\partial \beta} = 0$

approaches 1 as log(ρ) increases. Setting the derivative to 0:

$\log\left(\frac{1-\beta}{\beta}\right) + \log(\rho) - I(x/\beta) + \frac{x}{\beta}\, I^{\prime}(x/\beta) = 0.$

One can check that as ρ increases, log(ρ)→∞ and hence,

$\log\left(\frac{1-\beta}{\beta}\right) \to -\infty,$

which is reached as β→1. This shows that as ρ increases, β→1 for general noise distributions as well.

Alternative upper bound: Recall that, while proving Theorem 1, the path from node i to j in t steps whose sum was the maximum among all possible paths was of interest. To achieve this, first the number of paths from node i to j in t steps were counted, and then these paths were grouped in terms of the number of self-loops involved. Note that self-loops were not affected by noise, so their contribution to the sum along the path is 0. The analysis would be simpler if noise on self-loops was considered, thereby eliminating the need to count and group the paths by the number of self-loops involved. So considering noise on self-loops, which is equivalent to setting β=1 in Theorem 1, would result in the following recursion,

$x_{i}(t+1) = \max\left(x_{i}(t) + v_{ii}(t),\ \max_{j \in \mathcal{N}_{i}}\left(x_{j}(t) + v_{ij}(t)\right)\right), \qquad (17)$

instead of equation (1). Note that equation (17) is not the proposed max consensus scheme, but an auxiliary recursion used here to upper bound the growth rate. It can be observed that x_(i)(t+1) is convex in v_(ii)(t), and due to Jensen's inequality, the additional noise in equation (17) can only increase the slope λ compared to equation (1). Hence, the growth rate of equation (1) is upper bounded by that of equation (17). Repeating the proof of Theorem 1 for this case amounts to replacing A by A+I and therefore ρ with ρ+1, so the following is true:

Theorem 2. The auxiliary recursion in equation (17) has a growth rate upper bounded by the value of x>0 that solves,

$I(x) = \log(\rho + 1), \qquad (18)$

where I(x) is the large deviation rate function. Moreover, this value ofx upper bounds the growth rate λ of the recursion in equation (1).

Note that for the Gaussian noise distribution, the alternative upper bound on the growth rate can be calculated as,

$\lambda \leq \sqrt{2\log(\rho + 1)}. \qquad (19)$

While equation (19) is a looser bound than equation (15), it is much simpler. As ρ increases, i.e. as β→1, the alternative upper bound and the exact upper bound converge.

Lower Bound

While it is clear that λ≥0, it is not obvious when λ>0. In this section, a lower bound is derived, which, in part, shows that there exists a growth rate due to additive noise in the network which is always positive (λ>0). Also, the lower bound relates to the order statistics of the underlying noise distribution, as well as the steady state distribution of an underlying Markov chain.

Lower bound for regular graphs: Recall that the state of the i^(th) sensor at time t+1 is given by the i^(th) element of the vector x(t+1)=W(t,0)⊗x(0), which is,

$x_{i}(t+1) = \max_{j}\left([W(t,0)]_{i,j} + x_{j}(0)\right) \geq \max_{j}\, [W(t,0)]_{i,j} + x_{\min}(0), \qquad (20)$

where $x_{\min}(0) = \min_{i}\, x_{i}(0)$. Now, using equation (20), the growth rate λ is lower bounded as,

$\lambda = \lim_{t \to \infty} \frac{x_{i}(t)}{t} = \lim_{t \to \infty} \frac{x_{i}(t+1)}{t} \geq \lim_{t \to \infty} \frac{1}{t} \max_{j}\, [W(t,0)]_{i,j} + \lim_{t \to \infty} \frac{1}{t} x_{\min}(0) \qquad (21a)$

$\geq \lim_{t \to \infty} \frac{1}{t} \sum_{k=0}^{t-1} [W(k)]_{p(k),p(k+1)}, \qquad (21b)$

where equation (21a) is due to equation (20), and in equation (21b), $\{p(k)\}_{k=0}^{t}$ is any path that satisfies p(0)=j and p(t)=i. In order to get a good lower bound, evaluating equation (21b) is relied upon for a specific path, defined as,

$p(k+1) = \underset{m \in \mathcal{N}(p(k)) \cup p(k)}{\operatorname{argmax}}\, [W(k)]_{p(k),m}. \qquad (22)$

This amounts to selecting the locally optimum or greedy path. If the graph is d-regular, then with p(k) chosen as in equation (22), the random variables in equation (21b) are distributed the same as the maximum of d i.i.d random variables and zero, whose expectation is denoted as m₊(d). Therefore, due to the law of large numbers, equation (21b) converges to,

$\lambda \geq m_{+}(d) = \mathbb{E}\left[\max\left(0,\ \max_{m}\, [W(k)]_{p(k),m}\right)\right] = d\int_{0}^{\infty} x F^{d-1}(x) f(x)\, dx, \qquad (23)$

where F(·) and f(·) are the CDF and PDF of the noise, respectively. Also, one can lower bound the growth rate with a simple expression given by $\lambda \geq F^{-1}\left(\frac{d}{d+1}\right)$, provided that the median of the noise samples is zero.
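For standard Gaussian noise, m₊(d) and the simpler quantile bound can be computed numerically; a brief sketch using SciPy (the degree value is an arbitrary example):

```python
import numpy as np
from scipy import integrate
from scipy.stats import norm

def m_plus(d):
    """m_+(d) from equation (23): expectation of max(0, max of d i.i.d.
    N(0,1) samples), computed as d * int_0^inf x F^{d-1}(x) f(x) dx."""
    if d == 0:
        return 0.0
    integrand = lambda x: d * x * norm.cdf(x) ** (d - 1) * norm.pdf(x)
    val, _ = integrate.quad(integrand, 0.0, np.inf)
    return val

d = 4
print(m_plus(d), norm.ppf(d / (d + 1)))  # integral bound vs. F^{-1}(d/(d+1))
```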

Lower bound for irregular graphs: For irregular graphs, the path defined in equation (22) is a random walk on the graph, with the corresponding sequence of nodes constituting a Markov chain. When the graph is irregular, the transition probabilities of this Markov chain depend on the degree of the current node. Specifically, the transition probability matrix is given by,

$P = (1-k)\, D^{-1} A + kI,$

where the diagonal matrix [D]_(i,i)=d_(i) is the degree of node i, so

$[P]_{i,j} = \begin{cases} \frac{1-k}{d_{i}}, & i \neq j,\ (i,j) \in \mathcal{E} \\ k, & i = j \end{cases} \qquad (24)$

where k is the probability that the noise samples on the neighboring edges of node i are all negative, given by

$k = P\left[[W(k)]_{i,j} < 0,\ \forall j\right] = d_{i} \int_{-\infty}^{0} F^{d_{i}-1}(x) f(x)\, dx.$

Let the steady state probabilities of this Markov chain be denoted by π_(i). Then, using the law of large numbers, the lower bound is given by,

$\lim_{t \to \infty} \frac{1}{t} \sum_{k=0}^{t-1} \max_{m}\left([W(k)]_{p(k),m}\right) = \sum_{i=1}^{N} \pi_{i}\, m_{+}(d_{i}), \qquad (25)$

since the random variable $\max_{m}\left([W(k)]_{p(k),m}\right)$ has expectation m₊(d_(i)), given node i. One can find a closed form expression for π_(i) as $\pi_{i} = \frac{d_{i}}{2E}$, where E:=$|\mathcal{E}|$ is the total number of edges in the network. To verify this, one can check that π^(T)P=π^(T), where π^(T)=[π₁, . . . , π_(N)], using equation (24). In conclusion, the lower bound on the growth rate for irregular graphs is given by,

$\lambda \geq \sum_{i=1}^{N} \frac{d_{i}}{2E}\, m_{+}(d_{i}). \qquad (26)$

Bounds on Growth Rate for Random Graphs
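Equation (26) is straightforward to evaluate once m₊(·) is available; a sketch that reuses the m_plus helper defined above (this snippet assumes that definition is in scope):

```python
import numpy as np

def lower_bound_irregular(A):
    """Equation (26): sum_i (d_i / 2E) * m_plus(d_i), where pi_i = d_i / 2E
    is the stationary distribution of the greedy random walk."""
    d = A.sum(axis=1)
    two_E = d.sum()                     # the sum of degrees equals 2E
    return sum((di / two_E) * m_plus(int(di)) for di in d)
```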

In this section, the case is considered where each edge is absent with a probability p, independently across edges and time, which models random transmission failures.

Upper Bound for Random Graphs

It is now shown that the upper bound on the growth rate for randomly changing graphs can be simply obtained by replacing ρ in the fixed graph case by ρ(1−p) in equation (6), where p is the Bernoulli probability that any edge will be deleted independently at each iteration.

Recall that in the fixed graph model, W(k) had zero (e) along the diagonal, and [W(k)]_(l,m)=v_(lm)(k) were the underlying i.i.d noise random variables when (l,m) ∈ $\mathcal{E}$. The random graph can be described as,

$[W(k)]_{l,m} = \begin{cases} v_{l,m}(k), & \text{with prob. } 1-p,\ \text{if } (l,m) \in \mathcal{E} \\ -C, & \text{with prob. } p,\ \text{if } (l,m) \in \mathcal{E} \\ e, & l = m \\ \epsilon, & \text{if } (l,m) \notin \mathcal{E} \end{cases} \qquad (27)$

where C is a large positive constant which captures a randomly absent edge as C→∞. Note that, since each node is maxing with itself at each iteration in equation (1), the large negative value −C will never propagate through the network, which is equivalent to deleting an edge, for large C.
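A sketch of the random-graph noise matrix of equation (27) in Python (the symmetric whole-edge failures and the finite stand-in for C are modeling assumptions of this illustration):

```python
import numpy as np

def noise_matrix_random(A, sigma, p, rng, C=1e9):
    """W(k) from equation (27): each existing edge independently fails with
    probability p and then carries -C, which max() never propagates."""
    N = A.shape[0]
    W = np.where(A > 0, sigma * rng.standard_normal((N, N)), -np.inf)
    fail = np.triu(rng.random((N, N)) < p, 1)   # fail whole edges, not
    fail = (fail | fail.T) & (A > 0)            # individual directions
    W[fail] = -C
    np.fill_diagonal(W, 0.0)
    return W
```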

Following the analysis of the fixed graph case, only the moment generating function of the noise samples changes, to

$M(\gamma, C) = p\, e^{-C\gamma} + (1-p)\, M(\gamma),$

where M(γ) is the original moment generating function of the noise samples, given by $M(\gamma) = \mathbb{E}[e^{\gamma v_{ij}(k)}]$. The corresponding rate function is given by

$I(x, C) = \sup_{\gamma > 0}\left(x\gamma - \log(M(\gamma, C))\right).$

Following the proof of Theorem 1, to upper bound the growth rate for this case, the smallest x is found that satisfies,

$\lim_{C \to \infty}\ \sup_{0 \leq \beta \leq 1}\left(H(\beta) + \beta\log(\rho) - \beta I\!\left(\frac{x}{\beta}, C\right)\right) < 0.$

Consider $f(x,\beta,C) = H(\beta) + \beta\log(\rho) - \beta I\!\left(\frac{x}{\beta}, C\right)$; since f(x,β,C) is convex in C and concave in β, it can be written that,

$\inf_{C}\ \sup_{0 \leq \beta \leq 1} f(x,\beta,C) = \sup_{0 \leq \beta \leq 1}\ \inf_{C} f(x,\beta,C) = \sup_{0 \leq \beta \leq 1}\ \lim_{C \to \infty} f(x,\beta,C) = \sup_{0 \leq \beta \leq 1}\left(H(\beta) + \beta\log(\rho(1-p)) - \beta I\!\left(\frac{x}{\beta}\right)\right), \qquad (28)$

where the first equality is due to the classical minimax theorem, and the second is due to the monotonicity of f(x,β,C) in C. Hence, the upper bound can be written as,

$\lambda \leq \inf\left\{x : \sup_{0 \leq \beta \leq 1}\left[H(\beta) + \beta\log(\rho(1-p)) - \beta I\!\left(\frac{x}{\beta}\right)\right] < 0\right\}. \qquad (29)$

Interestingly, this is precisely the upper bound for fixed graphs, except with ρ(1−p) instead of ρ. While for a fixed graph ρ≥1 always holds, in the random graphs case it is possible to have ρ(1−p)<1. If ρ(1−p)≈0, then it is easy to check in equation (29) that the optimizing β is near zero. This can be contrasted with the case where ρ is large and the optimizing β was found to satisfy β≈1.

Lower Bound for Random Graphs

Here, the lower bound on the growth rate for randomly changing graphs is derived. Recall that the path defined in equation (22), with W(k) as defined in equation (27), yields a lower bound on the growth rate for graphs with an edge deletion probability of p.

Compared to equation (26), the only difference in the derivation is that node i will now have a random degree Z_(i), which is binomial with parameters (d_(i), 1−p). Due to the law of large numbers, equations (25)-(26) have an additional expectation with respect to this binomial distribution, resulting in the following expression,

$\lambda \geq \sum_{i=1}^{N} \pi_{i}\, \mathbb{E}\left[m_{+}(Z_{i})\right] = \sum_{i=1}^{N} \frac{d_{i}}{2E} \sum_{k=0}^{d_{i}} \binom{d_{i}}{k} p^{d_{i}-k} (1-p)^{k}\, m_{+}(k). \qquad (30)$

Note that in equation (30), π_(i)=d_(i)/2E still holds, since the transition probabilities of the Markov chain are still of the form in equation (24).
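A sketch evaluating equation (30) with SciPy's binomial pmf, again assuming the m_plus helper from the regular-graph sketch is in scope:

```python
import numpy as np
from scipy.stats import binom

def lower_bound_random(A, p):
    """Equation (30): thin each node's degree to Binomial(d_i, 1-p),
    average m_plus over that law, then weight by pi_i = d_i / 2E."""
    d = A.sum(axis=1).astype(int)
    two_E = d.sum()
    bound = 0.0
    for di in d:
        pmf = binom.pmf(np.arange(di + 1), di, 1 - p)
        bound += (di / two_E) * sum(pmf[k] * m_plus(k) for k in range(di + 1))
    return bound
```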

Upper Bound on Growth Rate without Calculating I(x)

In this section, a technique is presented to directly calculate the upper bound on the growth rate using the moment generating function, without having to compute the large deviation rate function of the additive noise distribution.

Recall that the upper bound on the growth rate is given by equation (29), where p=0 for fixed graphs. For convenience, let $K \triangleq \rho(1-p)$ and $\bar{f}(\beta, x) = H(\beta) + \beta\log(K) - \beta I(x/\beta)$. Since $I(x) = \sup_{\gamma > 0}\left(x\gamma - \log M(\gamma)\right)$, it can be written,

$\sup_{0 \leq \beta \leq 1} \bar{f}(\beta, x) = \inf_{\gamma > 0}\ \sup_{0 \leq \beta \leq 1}\left(H(\beta) + \beta\log(K) - x\gamma + \beta\log M(\gamma)\right), \qquad (31)$

where the minimax theorem is used to interchange the infimum and supremum, since log M(γ) is always convex. The inner supremum can be solved in closed form as,

$\beta^{*} = \frac{K M(\gamma)}{1 + K M(\gamma)}, \qquad (32)$

which yields,

$\sup_{0 \leq \beta \leq 1} \bar{f}(\beta, x) = \inf_{\gamma > 0}\left(H(\beta^{*}) + \beta^{*}\log(K M(\gamma)) - x\gamma\right).$

So,

$\inf\left\{x : \sup_{0 \leq \beta \leq 1} \bar{f}(\beta, x) < 0\right\} = \inf_{\gamma > 0}\left(\frac{1}{\gamma} H(\beta^{*}) + \frac{\beta^{*}}{\gamma}\log(K M(\gamma))\right). \qquad (33)$

Note that β* is also a function of γ. This technique is very useful to calculate the growth rate when I(x) is difficult to evaluate, or unavailable.
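A sketch of this computation in Python: the bound of equation (33) is a one-dimensional minimization over γ, given only the MGF (the bracketing interval, the clamping of β*, and the Gaussian example are assumptions of this illustration):

```python
import numpy as np
from scipy.optimize import minimize_scalar

def upper_bound_via_mgf(M, K):
    """Equation (33): bound the growth rate directly from the noise MGF M,
    with K = rho*(1-p), without computing the rate function I(x)."""
    def objective(gamma):
        KM = K * M(gamma)
        b = min(KM / (1 + KM), 1 - 1e-12)   # beta* from equation (32), clamped
        H = -b * np.log(b) - (1 - b) * np.log(1 - b)
        return (H + b * np.log(KM)) / gamma
    res = minimize_scalar(objective, bounds=(0.1, 6.0), method='bounded')
    return res.fun

# Standard Gaussian MGF M(gamma) = exp(gamma^2 / 2); agrees with equation (15).
print(upper_bound_via_mgf(lambda g: np.exp(g * g / 2), K=30.56))
```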

Empirical Upper Bound on Growth Rate

In this section, an empirical correction factor to the upper bound is proposed which improves the tightness of the bound for all network settings and noise distributions. In order to improve the tightness of the upper bound, a correction factor ϕ is introduced to the upper bound in equation (6). The correction factor ϕ depends only on the number of nodes N in the network, given by,

$\begin{matrix}{{\phi = {1 - \frac{1}{2\sqrt{N}}}},} &  34 )\end{matrix}$

and multiplies the upper bound in equation (6).

While there may be no proof that this correction will always yield an upper bound, the choice of ϕ was empirically validated over different graph topologies and noise distributions, and in all settings, ϕ improved the tightness of the bound. The intuition is that the approximations made in deriving the upper bound lead to a minor deviation in the tightness for smaller N, which can be fixed by ϕ. Note that, as N→∞, the compensation variable ϕ→1, hence ϕ mainly contributes for graphs with a smaller number of nodes.

The tightness of the upper bound in equation (6) is compared to the empirical bound in the simulations below, illustrating the accuracy of the correction factor ϕ.
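Applying the correction is a one-line scaling; a brief sketch that reuses the gaussian_upper_bound helper from the earlier sketch (assumed in scope):

```python
import numpy as np

def corrected_upper_bound(rho, N):
    """Empirical bound: scale the equation (15) value by the correction
    factor phi = 1 - 1/(2*sqrt(N)) of equation (34); phi -> 1 as N grows."""
    phi = 1.0 - 1.0 / (2.0 * np.sqrt(N))
    return phi * gaussian_upper_bound(rho)

print(corrected_upper_bound(rho=30.56, N=75))
```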

Robust Max Consensus Algorithm

Algorithm 1 Robust Max Consensus Algorithm

 1: First run:
 2:   Input: iterations = t_(max), # of nodes = N
 3:   Initialization:
 4:     Initialize all nodes to zero, x_(i)(0) = 0
 5:   repeat until: t_(max) iterations
 6:     for {i = 1 : N}
 7:       $x_{i}(t) = \max\left(x_{i}(t),\ \max_{j \in \mathcal{N}_{i}}\left(x_{j}(t-1) + v_{ij}(t-1)\right)\right)$
 8:     end: for
 9:   end: repeat
10:   growth rate estimate: $\hat{\lambda}_{i}(t_{\max}) = \frac{x_{i}(t_{\max})}{t_{\max}}$
11: Second run:
12:   Input: # of nodes = N, Initial state: x_(i)(0)
13:   repeat until: convergence
14:     for {i = 1 : N}
15:       $x_{i}(t) = \max\left(x_{i}(t),\ \max_{j \in \mathcal{N}_{i}}\left(x_{j}(t-1) + v_{ij}(t-1)\right)\right) - \hat{\lambda}_{i}(t_{\max})$
16:     end: for
17:   end: repeat

Max consensus algorithms in existing works fail to converge in the presence of noise, as there is no compensation for the positive drift induced by the noise. Some works develop a soft-max based average consensus (SMA) approach to approximate the maximum and compensate for the additive noise. However, those algorithms are sensitive to a design parameter, which controls the trade-off between estimation error and convergence speed. So, a fast max-based consensus algorithm is developed in this section, which is informed by the fact that there is a constant slope λ, analyzed in the previous sections, which can be estimated and removed. This makes the algorithm robust to the additive noise in the network.

If the spectral radius of the network and the noise variance are known, then by using Theorem 1, one can closely estimate the growth rate and subtract this value at each node after the node update. However, the noise variance and the spectral radius are not always known locally at each node. Hence, a fast max consensus algorithm generalized to unknown noise distributions is proposed, as described in Algorithm 1, where the slope is locally estimated at each node. The variance of this estimator is also analyzed.

The algorithm consists of two runs. In the first run, the state values of all the nodes are initialized to zero and the max consensus algorithm is run in the additive noise setting. This can be performed by a simple reset operation, which is available at every node, followed by initiating the conventional max consensus algorithm. Note that in this case the true maximum is zero, but due to the additive noise, the state values grow at the rate of λ. The growth rate estimate for node i, denoted by $\hat{\lambda}_{i}$, is computed locally over t_(max) iterations as,

$\begin{matrix}{{{{\hat{\lambda}}_{i}( t_{\max} )} = {\frac{1}{t_{\max}}{x_{i}( t_{\max} )}}},} &  35 )\end{matrix}$

which is the average increment in the state value of node i. Note that this estimation is done locally at every node. Also, the algorithm is memory-efficient, since the history of state values is not used, and only the information of the iteration index and the current state value is needed to estimate the growth rate.

In the second run, the max consensus algorithm is run on the actual measurements to find the maximum of the initial readings. The growth rate estimate $\hat{\lambda}_{i}$ is used to compensate for the error induced by the additive noise, as given in line (15) of Algorithm 1. Note that the estimator is independent of the type of additive noise distribution.
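Following lines 1-17 of Algorithm 1, a runnable end-to-end sketch in Python with NumPy (the ring topology, noise level, iteration budget and seeds are illustrative choices; the second run is truncated to a fixed iteration count rather than a convergence test):

```python
import numpy as np

def robust_max_consensus(A, x0, t_max, sigma=1.0, seed=0):
    """Two-run robust max consensus (Algorithm 1). The first run starts
    from zero states and estimates the noise-induced growth rate; the
    second runs max consensus on the real measurements, subtracting the
    growth-rate estimate after every update (line 15)."""
    rng = np.random.default_rng(seed)
    N = A.shape[0]
    nbrs = [np.flatnonzero(A[i]) for i in range(N)]

    def noisy_update(x):
        v = sigma * rng.standard_normal((N, N))   # v[i, j]: noise on j -> i
        return np.array([max(x[i], np.max(x[nbrs[i]] + v[i, nbrs[i]]))
                         for i in range(N)])

    # First run (lines 1-10): the true max is 0, so any growth is drift.
    x = np.zeros(N)
    for _ in range(t_max):
        x = noisy_update(x)
    lam_hat = x / t_max                            # equation (35), per node

    # Second run (lines 11-17): compensate each update by lam_hat.
    x = np.asarray(x0, dtype=float)
    for _ in range(t_max):
        x = noisy_update(x) - lam_hat
    return x

# Example: a ring of 10 nodes with measurements drawn from (100, 200).
N = 10
A = np.zeros((N, N)); idx = np.arange(N)
A[idx, (idx + 1) % N] = A[(idx + 1) % N, idx] = 1.0
x0 = np.random.default_rng(1).uniform(100, 200, N)
print(x0.max(), robust_max_consensus(A, x0, t_max=50))
```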

To clarify, FIG. 2 shows a methodology 200 for determining max consensus in a wireless distributed network. At block 202, a network is provided including N sensor nodes, each sensor node i assuming an assigned or measured state value x_(i)(t) and each connection between neighboring sensor nodes i and j assuming additive noise v_(i,j)(t), where i,j∈{1, . . . , N}. At block 204, a growth rate estimate $\hat{\lambda}_{i}$ associated with each respective sensor node i is determined. Block 204 spawns three sub-blocks 242, 244 and 246; at block 242, the state values x_(i)(t) of all sensor nodes are initialized to zero such that x_(i)(0)=0. This step is crucial as it allows direct measurement of sensor drift; any nonzero state value updated in further iterations based on x_(i)(0)=0 is due to sensor drift, network anomalies or noise from neighboring channels, allowing the system to fully characterize the growth rate estimate in the presence of zero sensor measurement. At block 244, the state values x_(i)(t) are updated for each sensor node i with the local maximum for t_(max) iterations such that

$x_{i}(t) = \max\left(x_{i}(t),\ \max_{j \in \mathcal{N}_{i}}\left(x_{j}(t-1) + v_{ij}(t-1)\right)\right).$ This step mirrors a typical max consensus algorithm; however, as discussed above, the initial sensor node state values were set to zero. At block 246, the growth rate estimate $\hat{\lambda}_{i}$ is determined such that

$\hat{\lambda}_{i}(t_{\max}) = \frac{x_{i}(t_{\max})}{t_{\max}}.$ In a perfect world with no communication noise, sensor drift or other network anomalies, x_(i)(t_(max)) when x_(i)(0)=0 would hypothetically be zero, as the maximum state value held by the sensor nodes from time t=0 would always be zero. Since this is not the case, nonzero x_(i)(t_(max)) is a false value due to sensor drift, communication noise, or other network anomalies.

Once the growth rate estimate $\hat{\lambda}_{i}$ has been determined, a true state value maximum for each node can be estimated by running the iteratively-updating max consensus methodology again, this time with true initial sensor state values and by removing the growth rate estimate from each state value at each iteration. This is shown in block 206, which includes determining a true state value maximum for each respective sensor node i of the plurality of N sensor nodes for each iteration t of a plurality of t_(max) iterations to generate a set of true state value maxima. At sub-block 262 of block 206, the initial state values x_(i)(0) are directly measured by each sensor node i. At sub-block 264 of block 206, each sensor node i is updated with a local maximum for t_(max) iterations, subtracting the growth rate estimate $\hat{\lambda}_{i}$ such that

$x_{i}(t) = \max\left(x_{i}(t),\ \max_{j \in \mathcal{N}_{i}}\left(x_{j}(t-1) + v_{ij}(t-1)\right)\right) - \hat{\lambda}_{i}(t_{\max}).$ This step yields a set of true state value maxima with growth rates due to noise removed, one for each node at each iteration. At block 208, a final state value maximum is selected from the set of true state value maxima, the final state value maximum being the maximum of the set.

Performance Analysis

To address the accuracy of the estimate in equation (35) over a finite number of iterations, the Efron-Stein inequality is used to show that the variance of the growth rate estimator $\hat{\lambda}_{i}(t_{\max})$ decreases as $\mathcal{O}(1/t_{\max})$, where t_(max) is the number of hops. For completeness, the Efron-Stein inequality is introduced in the following theorem.

Theorem 3. Let X₁, X₂, . . . , X_(n) be independent random variables, and let X_(q)′ be an independent copy of X_(q), for q≥1. Let Z=f(X₁, X₂, . . . , X_(q), . . . , X_(n)) and

$Z_{q}^{\prime} = f(X_{1}, X_{2}, \ldots, X_{q-1}, X_{q}^{\prime}, X_{q+1}, \ldots, X_{n});$

then

$\mathrm{Var}(Z) \leq \sum_{q=1}^{n} \mathbb{E}\left[\left((Z - Z_{q}^{\prime})_{+}\right)^{2}\right], \quad \text{where } (Z - Z_{q}^{\prime})_{+} = \max\left(0,\ Z - Z_{q}^{\prime}\right).$

The following theorem bounds the variance of the growth rate estimator.

Theorem 4. The variance of the growth rate estimator $\hat{\lambda}_{i}(t_{\max})$ satisfies,

$\mathrm{Var}\left(\hat{\lambda}_{i}(t_{\max})\right) \leq \frac{\sigma^{2}}{t_{\max}},$

where t_(max) is the number of iterations and σ²=Var(v_(ij)(t)).

Proof: Using equation (35), and recalling from Theorem 1, the expressionfor x_(i)(t_(max)) with zero initial conditions x_(i)(0)=0 is

$\hat{\lambda}_{i}(t_{\max}) = \frac{1}{t_{\max}} \max_{\{p(k)\} \in \bigcup_{j} P_{t_{\max}}(i,j)}\ \sum_{k=0}^{t_{\max}} [W(k)]_{p(k),p(k+1)}. \qquad (36)$

Next, Theorem 3 is used to bound the variance of equation (36). For simplicity of notation, set Z=$\hat{\lambda}_{i}(t_{\max})$, which depends on the noise samples v_(ij)(t) through W(k) in equation (36). So the independent random variables $\mathcal{X} = \{X_{1}, X_{2}, \ldots, X_{n}\}$ in Theorem 3 correspond to a re-indexing of v_(ij)(t), with n denoting the total number of noise samples that influence $\hat{\lambda}_{i}(t_{\max})$, which is approximately n≈E(t_(max)+1), where E=$|\mathcal{E}|$ is the total number of edges (the exact value of n depends on the graph topology). Z_(q)′ is set to be given by equation (36) when the noise sample v_(ij)(t) corresponding to X_(q) is replaced by an independent copy X_(q)′. Note that the path that maximizes equation (36) corresponds to a subset $\mathcal{M}(\mathcal{X})$ of {1, . . . , n}, with t_(max) elements.

If q ∉ $\mathcal{M}(\mathcal{X})$, then the maximal path is unaffected, so Z−Z_(q)′≤0 and (Z−Z_(q)′)₊=0. Hence, the analysis is simplified by considering only q ∈ $\mathcal{M}(\mathcal{X})$, so that Theorem 3 can be simplified from involving n terms in the upper bound to only t_(max) terms:

$\mathrm{Var}(Z) \leq \mathbb{E}_{\mathcal{X}}\left[\sum_{q \in \mathcal{M}(\mathcal{X})} \mathbb{E}\left[\left((Z - Z_{q}^{\prime})_{+}\right)^{2} \,\middle|\, \mathcal{X}\right]\right] = \mathbb{E}_{\mathcal{X}}\Bigg[\sum_{q \in \mathcal{M}(\mathcal{X})} \mathbb{E}\left[\left((Z - Z_{q}^{\prime})_{+}\right)^{2} \,\middle|\, X_{q} \geq X_{q}^{\prime},\, \mathcal{X}\right] P\left[X_{q} \geq X_{q}^{\prime} \,\middle|\, \mathcal{X}\right] + \sum_{q \in \mathcal{M}(\mathcal{X})} \mathbb{E}\left[\left((Z - Z_{q}^{\prime})_{+}\right)^{2} \,\middle|\, X_{q} < X_{q}^{\prime},\, \mathcal{X}\right] P\left[X_{q} < X_{q}^{\prime} \,\middle|\, \mathcal{X}\right]\Bigg], \qquad (37)$

where the equality is due to the total expectation theorem. Note that, for q ∈ $\mathcal{M}(\mathcal{X})$ and X_(q)<X_(q)′, the maximal path remains the same and (Z−Z_(q)′)₊=0. Using $P\left[X_{q} \geq X_{q}^{\prime} \mid \mathcal{X}\right] = \frac{1}{2}$, equation (37) reduces to,

$\mathrm{Var}(Z) \leq \frac{1}{2}\,\mathbb{E}_{\mathcal{X}} \sum_{q \in \mathcal{M}(\mathcal{X})} \mathbb{E}\left[\left((Z - Z_{q}^{\prime})_{+}\right)^{2} \,\middle|\, X_{q} \geq X_{q}^{\prime},\, \mathcal{X}\right] \leq \frac{1}{2}\,\mathbb{E}_{\mathcal{X}} \sum_{q \in \mathcal{M}(\mathcal{X})} \mathbb{E}\left[\left(\frac{1}{t_{\max}}(X_{q} - X_{q}^{\prime})_{+}\right)^{2} \,\middle|\, X_{q} \geq X_{q}^{\prime},\, \mathcal{X}\right], \qquad (38)$

where Z−Z_(q)′=(X_(q)−X_(q)′)/t_(max) is used if the maximal path does not change when X_(q)′ is substituted for X_(q); if, on the other hand, the maximal path changes, then Z−Z_(q)′≤(X_(q)−X_(q)′)/t_(max), which can be verified by considering a substitution of X_(q)′ in the original path, which is smaller than Z_(q)′. It is straightforward to show that the RHS of equation (38) is given by σ²/t_(max), which concludes the proof.
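The bound of Theorem 4 is easy to check by simulation; a brief Monte-Carlo sketch (the trial count, noise level and choice of node are arbitrary):

```python
import numpy as np

def check_variance_bound(A, t_max=50, trials=200, sigma=1.0, seed=2):
    """Compare the empirical variance of the growth rate estimate at one
    node against the Theorem 4 bound sigma^2 / t_max."""
    rng = np.random.default_rng(seed)
    N = A.shape[0]
    nbrs = [np.flatnonzero(A[i]) for i in range(N)]
    slopes = []
    for _ in range(trials):
        x = np.zeros(N)
        for _ in range(t_max):
            v = sigma * rng.standard_normal((N, N))
            x = np.array([max(x[i], np.max(x[nbrs[i]] + v[i, nbrs[i]]))
                          for i in range(N)])
        slopes.append(x[0] / t_max)           # lam_hat at node 0
    return np.var(slopes), sigma**2 / t_max   # empirical vs. bound
```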

In order to bound the variance of the max-consensus algorithm, Theorem 4 is used to write x_(i)(t) in the first run of the algorithm, with zero initial measurements, as,

$x_{i}(t) = \lambda t + \sigma\sqrt{t}\, Y_{t}, \qquad (39)$

where Y_(t) is an auxiliary random variable with Var(Y_(t))≤1, which is clearly equivalent to Theorem 4 after using $\hat{\lambda}_{i}(t_{\max}) = x_{i}(t)/t$.

In the second run of the algorithm, after D iterations, where D is the diameter of the network, all nodes converge on the maximum of the initial measurements. Hence, using the estimator $\hat{\lambda}_{i}(t_{\max})$, the state value can be written as,

$x_{i}(D) = \left(\lambda - \hat{\lambda}_{i}(t_{\max})\right) D + \sigma\sqrt{D}\, Y_{D} + x_{\max}(0), \qquad (40)$

where it is known that,

$\hat{\lambda}_{i}(t_{\max}) = \lambda + \frac{\sigma}{\sqrt{t_{\max}}}\, V_{t_{\max}}, \qquad (41)$

where $V_{t_{\max}}$ is an auxiliary random variable with $\mathrm{Var}(V_{t_{\max}}) \leq 1$. Since the two runs involve independent noise samples, substituting equation (41) into equation (40) gives,

$\mathrm{Var}\left(x_{i}(D)\right) \leq \sigma^{2}\left(\frac{D^{2}}{t_{\max}} + D\right). \qquad (42)$

This shows that the variance of the estimator scales linearly with thediameter of the network, as long as t_(max) also scales linearly with D.

Simulation Results

A distributed network with N=75 nodes is considered, as shown in FIG. 3. This irregular graph was randomly generated, following common practice. The spectral radius of the generated graph was computed to be ρ=30.56. Two graph topologies are considered for the simulations:

-   Fixed graphs: by selecting p=0, as in FIG. 3.
-   Time-varying graphs (random graphs): by selecting p=0.5.

Communication links between any two nodes have a noise component distributed as $\mathcal{N}(0,1)$. First, all nodes are initialized to 0 and the max consensus algorithm is run to estimate the growth rate $\hat{\lambda}_{i}(t_{\max})$, as in line 10 of the algorithm. Note that the following results are Monte-Carlo averaged over 500 iterations.

Efficiency of the Bounds

For fixed graphs, the upper bound given by equation (15), the empirical upper bound, the lower bound given by equation (26), and the Monte-Carlo estimate of the max consensus growth, labeled as "True max-consensus growth," are plotted for every node in FIG. 4 and compared. It is observed in FIG. 4 that the empirical upper bound is much tighter than the original upper bound.

The same experiment was repeated on a random graph, which was obtained by randomly deleting each edge of the graph in FIG. 3 with probability p=0.5. The comparison of the upper bound given by equation (28), the empirical upper bound, the lower bound given by equation (30), and the true Monte-Carlo estimate of the max consensus growth is shown in FIG. 5. Note that not only is the empirical upper bound tight for time-varying graphs, but it also generalizes to different graph topologies.

Next, simulations are run for non-Gaussian distributions, such as the Laplace and uniform distributions, to verify the tightness of the upper bound. In FIGS. 6-7, the performance of the upper bound and the empirical upper bound is compared for the network in FIG. 3 with N=75, where the noise on the links is sampled from the Laplace and continuous uniform distributions, respectively. The parameters of the Laplace distribution L(μ,b) were chosen as μ=0 and b=1/√2, and the uniform distribution U(a,b) as U(−√3, √3), to ensure zero mean and unit variance. The results also show that the empirical upper bound holds for general noise distributions. Since the Laplace distribution is heavy-tailed compared to the Gaussian and uniform distributions, it has a larger growth rate.

Performance of the Algorithms

The performance of conventional max consensus algorithms and the proposed algorithm is compared, subject to additive Gaussian noise $\mathcal{N}(0,1)$. In order to represent actual sensor measurements, for both fixed and random graphs, a synthetic dataset with nodes initialized with values over (100, 200) is considered, where the true maximum of the initial state values is 200. The robust max consensus algorithm given in Algorithm 1 is run over these initial measurements on both graphs. The results are Monte-Carlo averaged over 500 iterations.

For fixed graphs, the performance of the robust max consensus algorithm and the existing max-based consensus algorithm is shown in FIG. 8. It can be observed that the conventional max consensus algorithm diverges as t increases, whereas the proposed algorithm does not suffer from increasing linear bias. Even in the case of random graphs, the algorithm converges to the true maximum, whereas the conventional max consensus algorithm diverges as t increases, as shown in FIG. 9.

By comparing the dynamic range of the growth rate of the conventional max consensus algorithms in FIG. 8 and FIG. 9, it is observed that a) at t=30, state values over fixed graphs have a mean and standard deviation of 270.39 and 0.6966, respectively, and b) at t=30, state values over random graphs with p=0.5 have a mean and standard deviation of 261.09 and 0.9233, respectively. Thus, node state values grow more slowly for random graphs with 0<p<1, compared to fixed graphs (p=0), due to the reduced connectivity of the graph.

Comparison with Existing Works

The performance of the proposed algorithm was compared with the conventional max consensus algorithm in FIGS. 8-9; clearly, the conventional max consensus algorithm diverges in the presence of additive noise.

Additionally, the performance was compared against the soft-max based average consensus algorithm (SMA), as shown in FIG. 10. The soft maximum of a vector x=[x₁, . . . , x_(N)] is defined as:

${\mathrm{smax}}(x) = \frac{1}{\beta}\log\sum\limits_{i = 1}^{N} e^{\beta x_{i}},$

where β>0 is a design parameter. The same network with N=75 as in FIG. 3 is considered. Nodes were initialized linearly over (0, 1). The design parameter β of the SMA algorithm was set to β∈{6,10}. The proposed algorithm and the SMA algorithm were applied in the presence of additive noise 𝒩(0,1) distributed over the edges.

The SMA algorithm with β=6 converges faster than with β=10; however, β=6 yields a greater estimation error than β=10. In comparison with SMA, the proposed algorithm performs better in terms of the bias and variance of the estimate of the true maximum value, and in the number of iterations required for convergence.
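For reference, the soft maximum above is a log-sum-exp and can be computed stably as in the sketch below (an illustrative implementation, not the SMA algorithm itself); it also shows the bias-versus-β trade-off noted above, since smax(x) overestimates max(x) by at most log(N)/β.

```python
# Numerically stable soft maximum: (1/beta) * log(sum_i exp(beta * x_i)).
import numpy as np

def smax(x, beta):
    """Log-sum-exp soft maximum, computed stably by factoring out the max."""
    z = beta * np.asarray(x, dtype=float)
    m = z.max()
    return (m + np.log(np.exp(z - m).sum())) / beta

x = np.linspace(0.0, 1.0, 75)        # nodes initialized linearly over (0, 1)
for beta in (6, 10):
    print(beta, smax(x, beta))       # approaches max(x)=1 as beta grows
```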

Computing System

FIG. 11 illustrates an example of a suitable computing system 300 used to implement various aspects of the present system and methods for analysis of max consensus algorithms in the presence of additive noise. Example embodiments described herein may be implemented at least in part in electronic circuitry; in computer hardware executing firmware and/or software instructions; and/or in combinations thereof. Example embodiments also may be implemented using a computer program product (e.g., a computer program tangibly or non-transitorily embodied in a machine-readable medium and including instructions for execution by, or to control the operation of, a data processing apparatus, such as, for example, one or more programmable processors or computers). A computer program may be written in any form of programming language, including compiled or interpreted languages, and may be deployed in any form, including as a stand-alone program or as a subroutine or other unit suitable for use in a computing environment. Also, a computer program can be deployed to be executed on one computer, or to be executed on multiple computers at one site or distributed across multiple sites and interconnected by a communication network.

Certain embodiments are described herein as including one or more modules 312. Such modules 312 are hardware-implemented, and thus include at least one tangible unit capable of performing certain operations and may be configured or arranged in a certain manner. For example, a hardware-implemented module 312 may comprise dedicated circuitry that is permanently configured (e.g., as a special-purpose processor, such as a field-programmable gate array (FPGA) or an application-specific integrated circuit (ASIC)) to perform certain operations. A hardware-implemented module 312 may also comprise programmable circuitry (e.g., as encompassed within a general-purpose processor or other programmable processor) that is temporarily configured by software or firmware to perform certain operations. In some example embodiments, one or more computer systems (e.g., a standalone system, a client and/or server computer system, or a peer-to-peer computer system) or one or more processors may be configured by software (e.g., an application or application portion) as a hardware-implemented module 312 that operates to perform certain operations as described herein.

Accordingly, the term “hardware-implemented module” encompasses a tangible entity, be that an entity that is physically constructed, permanently configured (e.g., hardwired), or temporarily configured (e.g., programmed) to operate in a certain manner and/or to perform certain operations described herein. Considering embodiments in which hardware-implemented modules 312 are temporarily configured (e.g., programmed), each of the hardware-implemented modules 312 need not be configured or instantiated at any one instance in time. For example, where the hardware-implemented modules 312 comprise a general-purpose processor configured using software, the general-purpose processor may be configured as respective different hardware-implemented modules 312 at different times. Software may accordingly configure a processor 302, for example, to constitute a particular hardware-implemented module at one instance of time and to constitute a different hardware-implemented module 312 at a different instance of time.

Hardware-implemented modules 312 may provide information to, and/or receive information from, other hardware-implemented modules 312. Accordingly, the described hardware-implemented modules 312 may be regarded as being communicatively coupled. Where multiple of such hardware-implemented modules 312 exist contemporaneously, communications may be achieved through signal transmission (e.g., over appropriate circuits and buses) that connect the hardware-implemented modules. In embodiments in which multiple hardware-implemented modules 312 are configured or instantiated at different times, communications between such hardware-implemented modules may be achieved, for example, through the storage and retrieval of information in memory structures to which the multiple hardware-implemented modules 312 have access. For example, one hardware-implemented module 312 may perform an operation, and may store the output of that operation in a memory device to which it is communicatively coupled. A further hardware-implemented module 312 may then, at a later time, access the memory device to retrieve and process the stored output. Hardware-implemented modules 312 may also initiate communications with input or output devices.

As illustrated, the computing system 300 may be a general purpose computing device, although it is contemplated that the computing system 300 may include other computing systems, such as personal computers, server computers, hand-held or laptop devices, tablet devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronic devices, network PCs, minicomputers, mainframe computers, digital signal processors, state machines, logic circuitries, distributed computing environments that include any of the above computing systems or devices, and the like.

Components of the general purpose computing device may include various hardware components, such as a processor 302, a main memory 304 (e.g., a system memory), and a system bus 301 that couples various system components of the general purpose computing device to the processor 302. The system bus 301 may be any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, and a local bus using any of a variety of bus architectures. For example, such architectures may include Industry Standard Architecture (ISA) bus, Micro Channel Architecture (MCA) bus, Enhanced ISA (EISA) bus, Video Electronics Standards Association (VESA) local bus, and Peripheral Component Interconnect (PCI) bus, also known as Mezzanine bus.

The computing system 300 may further include a variety of computer-readable media 307 that includes removable/non-removable media and volatile/nonvolatile media, but excludes transitory propagated signals. Computer-readable media 307 may also include computer storage media and communication media. Computer storage media includes removable/non-removable media and volatile/nonvolatile media implemented in any method or technology for storage of information, such as computer-readable instructions, data structures, program modules or other data, such as RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that may be used to store the desired information/data and which may be accessed by the general purpose computing device. Communication media includes computer-readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media. The term “modulated data signal” means a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. For example, communication media may include wired media such as a wired network or direct-wired connection and wireless media such as acoustic, RF, infrared, and/or other wireless media, or some combination thereof. Computer-readable media may be embodied as a computer program product, such as software stored on computer storage media.

The main memory 304 includes computer storage media in the form of volatile/nonvolatile memory such as read only memory (ROM) and random access memory (RAM). A basic input/output system (BIOS), containing the basic routines that help to transfer information between elements within the general purpose computing device (e.g., during start-up), is typically stored in ROM. RAM typically contains data and/or program modules that are immediately accessible to and/or presently being operated on by the processor 302. For example, in one embodiment, data storage 306 holds an operating system, application programs, and other program modules and program data.

Data storage 306 may also include other removable/non-removable, volatile/nonvolatile computer storage media. For example, data storage 306 may be: a hard disk drive that reads from or writes to non-removable, nonvolatile magnetic media; a magnetic disk drive that reads from or writes to a removable, nonvolatile magnetic disk; and/or an optical disk drive that reads from or writes to a removable, nonvolatile optical disk such as a CD-ROM or other optical media. Other removable/non-removable, volatile/nonvolatile computer storage media may include magnetic tape cassettes, flash memory cards, digital versatile disks, digital video tape, solid state RAM, solid state ROM, and the like. The drives and their associated computer storage media provide storage of computer-readable instructions, data structures, program modules and other data for the general purpose computing device 300.

A user may enter commands and information through a user interface 340 or other input devices 345 such as a tablet, electronic digitizer, a microphone, keyboard, and/or pointing device, commonly referred to as a mouse, trackball or touch pad. Other input devices 345 may include a joystick, game pad, satellite dish, scanner, or the like. Additionally, voice inputs, gesture inputs (e.g., via hands or fingers), or other natural user interfaces may also be used with the appropriate input devices, such as a microphone, camera, tablet, touch pad, glove, or other sensor. These and other input devices 345 are often connected to the processor 302 through a user interface 340 that is coupled to the system bus 301, but may be connected by other interface and bus structures, such as a parallel port, game port or a universal serial bus (USB). A monitor 360 or other type of display device is also connected to the system bus 301 via user interface 340, such as a video interface. The monitor 360 may also be integrated with a touch-screen panel or the like.

The general purpose computing device may operate in a networked or cloud-computing environment using logical connections of a network interface 303 to one or more remote devices, such as a remote computer. The remote computer may be a personal computer, a server, a router, a network PC, a peer device or other common network node, and typically includes many or all of the elements described above relative to the general purpose computing device. The logical connection may include one or more local area networks (LAN) and one or more wide area networks (WAN), but may also include other networks. Such networking environments are commonplace in offices, enterprise-wide computer networks, intranets and the Internet.

When used in a networked or cloud-computing environment, the general purpose computing device may be connected to a public and/or private network through the network interface 303. In such embodiments, a modem or other means for establishing communications over the network is connected to the system bus 301 via the network interface 303 or other appropriate mechanism. A wireless networking component including an interface and antenna may be coupled through a suitable device such as an access point or peer computer to a network. In a networked environment, program modules depicted relative to the general purpose computing device, or portions thereof, may be stored in the remote memory storage device.

CONCLUSION

A practical approach for reliable estimation of the maximum of the initial state values of nodes in a distributed network, in the presence of additive noise, is proposed. First, the existence of a constant growth rate due to additive noise was shown, and upper and lower bounds for the growth rate were derived. It is argued that the growth rate is constant and that the upper bound is a function of the spectral radius of the graph. By deriving a lower bound, it was proved that the growth rate is always a positive non-zero real value. Upper and lower bounds on the growth rate for random time-varying graphs were also derived. An empirical upper bound is obtained by scaling the original bound, which is shown to be tighter and generalizable to different networks and noise settings. Finally, a fast max-based consensus algorithm was presented that is robust to additive noise, and it was shown, using concentration inequalities, that the variance of the growth rate estimator used in this algorithm decreases as 𝒪(t_(max)⁻¹). It was also shown that the variance of the estimator scales linearly with the diameter of the network. Simulation results corroborating the theory were also provided.

It should be understood from the foregoing that, while particular embodiments have been illustrated and described, various modifications can be made thereto without departing from the spirit and scope of the invention, as will be apparent to those skilled in the art. Such changes and modifications are within the scope and teachings of this invention as defined in the claims appended hereto.

What is claimed is:
1. A distributed sensor network system, comprising: a plurality of N sensor nodes, each sensor node i∈N assuming an assigned or measured state value x_(i)(t); a processor that estimates a final state value maximum of a plurality of state values x_(i)(t) respectively produced by each sensor node i of the plurality of N sensor nodes of the distributed sensor network; wherein to estimate the final state value maximum of the plurality of state values, the processor: determines a growth rate estimate λ_(i) associated with each respective sensor node i of the plurality of N sensor nodes, wherein to determine the growth rate estimate λ_(i) the processor: initializes the state value x_(i)(t) of each sensor node i of the plurality of N sensor nodes to zero such that x_(i)(0)=0 for all i∈N; and updates the state value x_(i)(t) of each sensor node i for t_(max) iterations with a local maximum of the state values x_(i)(t) and x_(j)(t−1) of the sensor node i and one or more neighboring sensor nodes j; wherein the growth rate estimate λ_(i) is described by ${\hat{\lambda}}_{i}(t_{\max}) = \frac{x_{i}(t_{\max})}{t_{\max}};$ determines a true state value maximum of a plurality of true state value maxima for each respective sensor node i of the plurality of N sensor nodes at each iteration t of a plurality of t_(max) iterations to generate a set of true state value maxima, wherein to determine the true state value maximum the processor: measures an initial state value x_(i)(0) by each sensor node i; and updates the state value x_(i)(t) of each sensor node i for t_(max) iterations with a local maximum of the state values x_(i)(t) and x_(j)(t−1) of the sensor node i and one or more neighboring sensor nodes j, wherein the growth rate estimate λ_(i) is removed from each state value x_(i)(t); and selects a final state value maximum from the set of true state value maxima.
2. The distributed sensor network system of claim 1, wherein the one or more state values associated with the one or more neighboring sensor nodes of the plurality of sensor nodes include additive noise.
3. The distributed sensor network system of claim 1, wherein to determine a maximum of a state value x_(i)(t) associated with the sensor node i and one or more state values x_(j)(t) associated with one or more neighboring sensor nodes j of the plurality of N sensor nodes for an iteration t of the plurality of t_(max) iterations, the processor: obtains a state value x_(i)(t) measured by the sensor node i at iteration t; obtains one or more state values x_(j)(t) associated with one or more neighboring sensor nodes j from a previous iteration t−1; compares the state value x_(i)(t) measured by the sensor node i with each of the one or more state values x_(j)(t) associated with one or more neighboring sensor nodes j; and selects the maximum from the state value x_(i)(t) measured by the sensor node i and each of the one or more state values x_(j)(t) associated with one or more neighboring sensor nodes j.
4. The distributed sensor network system of claim 1, wherein the growth rate estimate λ_(i) is representative of a constant slope in max-based consensus measurement for a sensor node i of the plurality of N sensor nodes due to additive noise.
5. The distributed sensor network system of claim 1, wherein the measured initial state value x_(i)(0) associated with a sensor node i of the plurality of N sensor nodes is a value measured by the sensor node prior to a first iteration t=1 of the plurality of t_(max) iterations.
6. The distributed sensor network system of claim 1, wherein each connection between neighboring sensor nodes i and j contributes additive noise v_(i,j)(t) to each state value x_(i)(t) and x_(j)(t) of the neighboring sensor nodes i and j, where i,j∈N.
7. The distributed sensor network system of claim 1, wherein to determine the growth rate estimate, the step of updating the state value x_(i)(t) of each sensor node i for t_(max) iterations with a local maximum of the state values x_(i)(t) and x_(j)(t−1) of the sensor node i and one or more neighboring sensor nodes j is such that: ${x_{i}(t)} = \max\left( {x_{i}(t)},\; \max\limits_{j \in \mathcal{N}_{i}}\left( {x_{j}(t-1)} + {v_{ij}(t-1)} \right) \right).$
8. The distributed sensor network system of claim 1, wherein to determine the true state value maximum, the step of updating the state value x_(i)(t) of each sensor node i for t_(max) iterations with a local maximum of the state values x_(i)(t) and x_(j)(t−1) of the sensor node i and one or more neighboring sensor nodes j is such that: ${x_{i}(t)} = \max\left( {x_{i}(t)},\; \max\limits_{j \in \mathcal{N}_{i}}\left( {x_{j}(t-1)} + {v_{ij}(t-1)} \right) \right) - {\hat{\lambda}}_{i}(t_{\max}).$
9. The distributed sensor network system of claim 1, wherein the final state value maximum is a maximum of the set of true state value maxima.
10. A distributed sensor network system comprising: a plurality of N sensor nodes, each sensor node i∈N assuming an assigned or measured state value x_(i)(t); a processor that estimates a final state value maximum of a plurality of state values x_(i)(t) respectively produced by each sensor node i of the plurality of N sensor nodes of the distributed sensor network; wherein to estimate the final state value maximum of the plurality of state values x_(i)(t), the processor: determines a growth rate estimate λ_(i) associated with each respective sensor node i of the plurality of N sensor nodes; determines a true state value maximum of a plurality of true state value maxima for each respective sensor node i of the plurality of N sensor nodes at each iteration t of a plurality of t_(max) iterations to generate a set of true state value maxima by removing the growth rate estimate λ_(i) associated with each respective sensor node i from the respective state values x_(i)(t) of each sensor node i of the plurality of N sensor nodes; and selects a final state value maximum from the set of true state value maxima.
11. The system of claim 10, wherein to determine the growth rate estimate λ_(i) the processor: initializes the state value x_(i)(t) of each sensor node i of the plurality of N sensor nodes to zero such that x_(i)(0)=0 for all i∈N; and updates the state value x_(i)(t) of each sensor node i for t_(max) iterations with a local maximum of the state values x_(i)(t) and x_(j)(t−1) of the sensor node i and one or more neighboring sensor nodes j; wherein the growth rate estimate λ_(i) is described by ${\hat{\lambda}}_{i}(t_{\max}) = \frac{x_{i}(t_{\max})}{t_{\max}}.$
12. The system of claim 10, wherein to determine the true state value maximum the processor: measures an initial state value x_(i)(0) by each sensor node i; and updates the state value x_(i)(t) of each sensor node i for t_(max) iterations with a local maximum of the state values x_(i)(t) and x_(j)(t−1) of the sensor node i and one or more neighboring sensor nodes j.
13. The system of claim 12, wherein to update the state value x_(i)(t) of each sensor node i for t_(max) iterations with a local maximum of the state values x_(i)(t) and x_(j)(t−1) of the sensor node i and one or more neighboring sensor nodes, the processor: determines an initial state value maximum x_(i)(t) using the state value x_(i)(0) associated with the sensor node i and one or more measured state values x_(j)(t) associated with one or more neighboring sensor nodes of the plurality of N sensor nodes; removes the growth rate estimate λ_(i) associated with the sensor node i from the initial state value maximum x_(i)(t) to obtain the true state value maximum for the sensor node i of the plurality of sensor nodes; and assigns the true state value maximum for each sensor node of the plurality of sensor nodes.
14. The system of claim 12, wherein the updated state value x_(i)(t) is given by: ${x_{i}(t)} = \max\left( {x_{i}(t)},\; \max\limits_{j \in \mathcal{N}_{i}}\left( {x_{j}(t-1)} + {v_{ij}(t-1)} \right) \right) - {\hat{\lambda}}_{i}(t_{\max}).$
15. The system of claim 6, wherein to determine the growth rate estimate λ_(i) the processor: determines an upper bound on a growth rate estimate λ_(i) based on a spectral radius of the plurality of N sensor nodes.
16. A method for determining max-consensus of a plurality of nodes in a distributed network system, comprising: providing a network including N sensor nodes, each sensor node i assuming an assigned or measured state value x_(i)(t) and each connection between neighboring sensor nodes i and j assuming additive noise v_(i,j)(t), where i,j∈N; determining a growth rate estimate λ_(i) associated with each respective sensor node i; determining a true state value maximum for each respective sensor node i of the plurality of N sensor nodes for each iteration t of a plurality of t_(max) iterations to generate a set of true state value maxima; and selecting a final state value maximum from the set of true state value maxima.
17. The method of claim 16, wherein the step of determining the growth rate estimate λ_(i) associated with each respective sensor node i further comprises: initializing the state value x_(i)(t) of all sensor nodes to zero such that x_(i)(0)=0; updating the state value x_(i)(t) of each sensor node i with a local maximum for t_(max) iterations such that ${x_{i}(t)} = \max\left( {x_{i}(t)},\; \max\limits_{j \in \mathcal{N}_{i}}\left( {x_{j}(t-1)} + {v_{ij}(t-1)} \right) \right);$ and determining the growth rate estimate λ_(i) such that ${\hat{\lambda}}_{i}(t_{\max}) = \frac{x_{i}(t_{\max})}{t_{\max}}.$
18. The method of claim 16, wherein the step of determining the true state value maximum further comprises: measuring an initial state value x_(i)(0) by each sensor node i; and updating each sensor node i with a local maximum for t_(max) iterations and subtracting the growth rate estimate λ_(i) such that ${x_{i}(t)} = \max\left( {x_{i}(t)},\; \max\limits_{j \in \mathcal{N}_{i}}\left( {x_{j}(t-1)} + {v_{ij}(t-1)} \right) \right) - {\hat{\lambda}}_{i}(t_{\max}).$
19. The method of claim 16, wherein each connection between neighboring sensor nodes i and j in the network contributes additive noise v_(i,j)(t) to each state value x_(i)(t) and x_(j)(t) of the neighboring sensor nodes i and j, where i,j∈N.
20. The method of claim 16, wherein the step of determining the growth rate estimate λ_(i) further comprises: determining an upper bound on a growth rate estimate λ_(i) based on a spectral radius of the plurality of N sensor nodes of the network.